Transformer models have achieved great success across many NLP problems. However, previous studies in automated ICD coding concluded that these models fail to outperform some of the earlier solutions such as CNN-based models. In this paper we challenge this conclusion. We present a simple and scalable method to process long text with the existing transformer models such as BERT. We show that this method significantly improves the previous results reported for transformer models in ICD coding, and is able to outperform one of the prominent CNN-based methods.
Performance metrics-driven context caching has a profound impact on throughput and response time in distributed context management systems for real-time context queries. This paper proposes a reinforcement learning based approach to adaptively cache context with the objective of minimizing the cost incurred by context management systems in responding to context queries. Our novel algorithms enable context queries and sub-queries to reuse and repurpose cached context in an efficient manner. This approach differs from traditional data caching approaches in three main ways. First, we make selective context cache admissions using no prior knowledge of the context or the context query load. Second, we develop and incorporate innovative heuristic models to calculate the expected performance of caching an item when making the decisions. Third, our strategy defines a time-aware continuous cache action space. We present two reinforcement learning agents: a value function estimating actor-critic agent and a policy search agent using the deep deterministic policy gradient method. The paper also proposes adaptive policies such as eviction and cache memory scaling to complement our objective. Our method is evaluated using a synthetically generated load of context sub-queries and a synthetic data set inspired by real-world data and query samples. We further investigate optimal adaptive caching configurations under different settings. This paper presents, compares, and discusses our findings that the proposed selective caching methods achieve short- and long-term cost and performance efficiency. The paper demonstrates that the proposed methods outperform other modes of context management, such as redirector mode, database mode, and a cache-all policy, by up to 60% in cost efficiency.
Traffic jams occurring on highways cause increased travel time as well as increased fuel consumption and collisions. Traffic jams without a clear cause, such as an on-ramp or an accident, are called phantom traffic jams and are said to make up 50% of all traffic jams. They are the result of an unstable traffic flow caused by human driving behavior. Automating the longitudinal vehicle motion of only 5% of all cars in the flow can dissipate phantom traffic jams. However, driving automation introduces safety issues when human drivers need to take over control from the automation. We investigated whether phantom traffic jams can be dissolved using haptic shared control. This keeps humans in the loop and thus bypasses the problem of humans' limited capacity to take over control, while benefiting from most advantages of automation. In an experiment with 24 participants in a driving simulator, we tested the effect of haptic shared control on the dynamics of traffic flow, and compared it with manual control and full automation. We also investigated the effect of two control types on participants' behavior during simulated silent automation failures. Results show that haptic shared control can help dissipate phantom traffic jams better than fully manual control but worse than full automation. We also found that haptic shared control reduces the occurrence of unsafe situations caused by silent automation failures compared to full automation. Our results suggest that haptic shared control can dissipate phantom traffic jams while preventing safety risks associated with full automation.
Autonomous vehicles currently suffer from a time-inefficient driving style caused by uncertainty about human behavior in traffic interactions. Accurate and reliable prediction models enabling more efficient trajectory planning could make autonomous vehicles more assertive in such interactions. However, the evaluation of such models is commonly oversimplistic, ignoring the asymmetric importance of prediction errors and the heterogeneity of the datasets used for testing. We examine the potential of recasting interactions between vehicles as gap acceptance scenarios and evaluating models in this structured environment. To that end, we develop a framework facilitating the evaluation of any model, by any metric, and in any scenario. We then apply this framework to state-of-the-art prediction models, which all show themselves to be unreliable in the most safety-critical situations.
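Inferring reward functions from demonstrations and pairwise preferences is a promising approach for aligning reinforcement learning (RL) agents with human intentions. However, state-of-the-art methods typically focus on learning a single reward model, which makes it difficult to trade off different reward functions from multiple experts. We propose Multi-Objective Reinforced Active Learning (MORAL), a novel method for combining diverse demonstrations of social norms into a Pareto-optimal policy. By maintaining a distribution over scalarization weights, our approach is able to interactively tune a deep RL agent towards a variety of preferences, while eliminating the need to compute multiple policies. We empirically demonstrate the effectiveness of MORAL in two scenarios, which model a delivery task and an emergency task that require an agent to act in the presence of normative conflicts. Overall, we consider our research a step towards multi-objective RL with learned rewards, bridging the gap between the current reward learning and machine ethics literature.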
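A major challenge for autonomous vehicles is interacting with other traffic participants safely and smoothly. A promising approach to handling such traffic interactions is to equip autonomous vehicles with interaction-aware controllers (IACs). These controllers predict how surrounding human drivers will respond to the autonomous vehicle's actions, based on a driver model. However, the predictive validity of the driver models used in IACs is rarely validated, which can limit the interactive capabilities of IACs outside simple simulated environments. In this paper, we argue that, besides evaluating the interactive capabilities of IACs, their underlying driver models should be validated on natural human driving behavior. We propose a workflow for this validation that includes scenario-based data extraction and a two-stage (tactical/operational) evaluation procedure based on human factors literature. We demonstrate this workflow in a case study on an inverse-reinforcement-learning-based driver model replicated from an existing IAC. This model showed the correct tactical behavior in only 40% of the predictions. The model's operational behavior was inconsistent with observed human behavior. The case study illustrates that a principled evaluation workflow is useful and needed. We believe that our workflow will support the development of appropriate driver models for future automated vehicles.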